Data-driven root-cause analysis for distributed system anomalies
Modern distributed cyber-physical systems encounter a large variety of
anomalies and in many cases, they are vulnerable to catastrophic fault
propagation scenarios due to strong connectivity among the sub-systems. In this
regard, root-cause analysis becomes highly intractable due to complex fault
propagation mechanisms in combination with diverse operating modes. This paper
presents a new data-driven framework for root-cause analysis for addressing
such issues. The framework is based on a spatiotemporal feature extraction
scheme for distributed cyber-physical systems built on the concept of symbolic
dynamics for discovering and representing causal interactions among subsystems
of a complex system. We present two approaches for root-cause analysis, namely
the sequential state switching ($S^3$, based on the free energy concept of a
Restricted Boltzmann Machine, RBM) and artificial anomaly association ($A^3$, a
multi-class classification framework using deep neural networks, DNN).
Synthetic data from cases with failed pattern(s) and anomalous node(s) are
simulated to validate the proposed approaches, and the results are compared with the
performance of vector autoregressive (VAR) model-based root-cause analysis.
A real dataset based on the Tennessee Eastman process (TEP) is also used for
validation. The results show that: (1) the $S^3$ and $A^3$ approaches can obtain
high accuracy in root-cause analysis and successfully handle multiple nominal
operation modes, and (2) the proposed tool-chain is shown to be scalable while
maintaining high accuracy.
Comment: 6 pages, 3 figures
Online Robust Policy Learning in the Presence of Unknown Adversaries
The growing prospect of deep reinforcement learning (DRL) being used in
cyber-physical systems has raised concerns around safety and robustness of
autonomous agents. Recent work on generating adversarial attacks has shown
that it is computationally feasible for a bad actor to fool a DRL policy into
behaving suboptimally. Although certain adversarial attacks with specific
attack models have been addressed, most studies are only interested in off-line
optimization in the data space (e.g., example fitting, distillation). This
paper introduces a Meta-Learned Advantage Hierarchy (MLAH) framework that is
attack model-agnostic and more suited to reinforcement learning, via handling
the attacks in the decision space (as opposed to data space) and directly
mitigating learned bias introduced by the adversary. In MLAH, we learn separate
sub-policies (nominal and adversarial) in an online manner, as guided by a
supervisory master agent that detects the presence of the adversary by
leveraging the advantage function for the sub-policies. We demonstrate that the
proposed algorithm enables policy learning with significantly lower bias as
compared to the state-of-the-art policy learning approaches even in the
presence of heavy state information attacks. We present algorithm analysis and
simulation results using popular OpenAI Gym environments.
Comment: 18 pages, 9 figures
An unsupervised spatiotemporal graphical modeling approach to anomaly detection in distributed CPS
Modern distributed cyber-physical systems (CPSs) encounter a large variety of
physical faults and cyber anomalies and in many cases, they are vulnerable to
catastrophic fault propagation scenarios due to strong connectivity among the
sub-systems. This paper presents a new data-driven framework for system-wide
anomaly detection for addressing such issues. The framework is based on a
spatiotemporal feature extraction scheme built on the concept of symbolic
dynamics for discovering and representing causal interactions among the
subsystems of a CPS. The extracted spatiotemporal features are then used to
learn system-wide patterns via a Restricted Boltzmann Machine (RBM). The
results show that: (1) the RBM free energy in the off-nominal conditions is
different from that in the nominal conditions and can be used for anomaly
detection; (2) the framework can capture multiple nominal modes with one
graphical model; (3) the case studies with simulated data and an integrated
building system validate the proposed approach.
Comment: ICCPS 201
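The free-energy criterion described in this abstract can be illustrated with a minimal sketch. It assumes a binary RBM with the standard free energy F(v) = -a·v - Σ_j log(1 + exp(b_j + Σ_i v_i W_ij)), and it assumes that off-nominal inputs score a higher free energy than nominal ones, so a simple threshold flags them. The weights, biases, threshold, and function names below are hypothetical and are not taken from the paper.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def free_energy(v, W, a, b):
    """Free energy of a binary RBM for visible vector v.

    W[i][j] couples visible unit i to hidden unit j; a and b are the
    visible and hidden biases.
    """
    visible_term = sum(ai * vi for ai, vi in zip(a, v))
    hidden_term = 0.0
    for j in range(len(b)):
        pre_activation = b[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        hidden_term += softplus(pre_activation)
    return -visible_term - hidden_term


def is_anomalous(v, W, a, b, threshold):
    # Assumption: anomalies show up as free energy above a chosen threshold.
    return free_energy(v, W, a, b) > threshold


# Hypothetical parameters: hidden unit 0 favours the pattern [1, 1, 0].
W = [[2.0, 0.0], [2.0, 0.0], [-2.0, 0.0]]
a = [0.0, 0.0, 0.0]
b = [-1.0, 0.0]

nominal = [1, 1, 0]
off_nominal = [0, 0, 1]
threshold = -2.0  # hypothetical; in practice chosen from nominal data

print(free_energy(nominal, W, a, b), is_anomalous(nominal, W, a, b, threshold))
print(free_energy(off_nominal, W, a, b), is_anomalous(off_nominal, W, a, b, threshold))
```

In this toy setting the pattern that the hidden unit was built to favour receives the lower free energy, which is the property the abstract relies on for detection.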
Semantic Adversarial Attacks: Parametric Transformations That Fool Deep Classifiers
Deep neural networks have been shown to exhibit an intriguing vulnerability
to adversarial input images corrupted with imperceptible perturbations.
However, the majority of adversarial attacks assume global, fine-grained
control over the image pixel space. In this paper, we consider a different
setting: what happens if the adversary could only alter specific attributes of
the input image? These would generate inputs that might be perceptibly
different, but still natural-looking and enough to fool a classifier. We
propose a novel approach to generate such `semantic' adversarial examples by
optimizing a particular adversarial loss over the range-space of a parametric
conditional generative model. We demonstrate implementations of our attacks on
binary classifiers trained on face images, and show that such natural-looking
semantic adversarial examples exist. We evaluate the effectiveness of our
attack on synthetic and real data, and present detailed comparisons with
existing attack methods. We supplement our empirical results with theoretical
bounds that demonstrate the existence of such parametric adversarial examples.
Comment: Accepted to International Conference on Computer Vision, (ICCV) 201
Collaborative Deep Learning in Fixed Topology Networks
There is significant recent interest in parallelizing deep learning algorithms
in order to handle the enormous growth in data and model sizes. While most
advances focus on model parallelization and engaging multiple computing agents
by using a central parameter server, the combination of data parallelization and
decentralized computation has not been explored sufficiently. In this context,
this paper presents a new consensus-based distributed SGD (CDSGD) (and its
momentum variant, CDMSGD) algorithm for collaborative deep learning over fixed
topology networks that enables data parallelization as well as decentralized
computation. Such a framework can be extremely useful for learning agents with
access to only local/private data in a communication constrained environment.
We analyze the convergence properties of the proposed algorithm with strongly
convex and nonconvex objective functions with fixed and diminishing step sizes
using concepts of Lyapunov function construction. We demonstrate the efficacy
of our algorithms in comparison with the baseline centralized SGD and the
recently proposed federated averaging algorithm (that also enables data
parallelism) based on benchmark datasets such as MNIST, CIFAR-10 and CIFAR-100.
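As a rough illustration of the consensus-based update, the sketch below assumes the commonly stated CDSGD form, in which each agent j replaces its parameter with a weighted average of its neighbours' parameters and then subtracts a step along its own local gradient: x_j ← Σ_l π_jl x_l - α g_j(x_j). The scalar quadratic objectives, the mixing matrix, the step size, and the function names are hypothetical choices for this example and do not come from the paper; true gradients stand in for stochastic ones.

```python
def cdsgd_step(x, mixing, grads, step_size):
    """One consensus-plus-gradient step for all agents.

    x[j] is agent j's (scalar) parameter, mixing[j][l] is the weight agent j
    gives to agent l, and grads[j] is agent j's local gradient function.
    """
    n = len(x)
    return [
        sum(mixing[j][l] * x[l] for l in range(n)) - step_size * grads[j](x[j])
        for j in range(n)
    ]


def run_cdsgd(targets, mixing, step_size=0.05, iterations=500):
    # Each agent only knows its own objective 0.5 * (x - target)^2.
    grads = [lambda x, t=t: x - t for t in targets]
    x = [0.0] * len(targets)
    for _ in range(iterations):
        x = cdsgd_step(x, mixing, grads, step_size)
    return x


# Hypothetical doubly stochastic mixing matrix for three connected agents.
mixing = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
targets = [1.0, 2.0, 6.0]  # private data: the global optimum is their mean, 3.0

x_final = run_cdsgd(targets, mixing)
print(x_final, sum(x_final) / len(x_final))
```

With a fixed step size the agents in this toy problem settle near, but not exactly at, a common value, while their average matches the minimizer of the summed objective. No agent ever sees another agent's target, only its parameter.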
Root-cause Analysis for Time-series Anomalies via Spatiotemporal Graphical Modeling in Distributed Complex Systems
Performance monitoring, anomaly detection, and root-cause analysis in complex
cyber-physical systems (CPSs) are often highly intractable due to widely
diverse operational modes, disparate data types, and complex fault propagation
mechanisms. This paper presents a new data-driven framework for root-cause
analysis, based on a spatiotemporal graphical modeling approach built on the
concept of symbolic dynamics for discovering and representing causal
interactions among sub-systems of complex CPSs. We formulate the root-cause
analysis problem as a minimization problem via the proposed inference-based
metric and present two approximate approaches for root-cause analysis, namely
the sequential state switching ($S^3$, based on the free energy concept of a
restricted Boltzmann machine, RBM) and artificial anomaly association ($A^3$, a
classification framework using deep neural networks, DNN). Synthetic data from
cases with failed pattern(s) and anomalous node(s) are simulated to validate
the proposed approaches. A real dataset based on the Tennessee Eastman process (TEP)
is also used for comparison with other approaches. The results show that: (1)
the $S^3$ and $A^3$ approaches can obtain high accuracy in root-cause analysis
under both pattern-based and node-based fault scenarios, in addition to
successfully handling multiple nominal operating modes, (2) the proposed
tool-chain is shown to be scalable while maintaining high accuracy, and (3) the
proposed framework is robust and adaptive in different fault conditions and
performs better in comparison with the state-of-the-art methods.
Comment: 42 pages, 5 figures. arXiv admin note: text overlap with
arXiv:1605.0642
Flow Shape Design for Microfluidic Devices Using Deep Reinforcement Learning
Microfluidic devices are utilized to control and direct flow behavior in a
wide variety of applications, notably in medical diagnostics. A
particularly popular form of microfluidics -- called inertial microfluidic flow
sculpting -- involves placing a sequence of pillars to controllably deform an
initial flow field into a desired one. Inertial flow sculpting can be formally
defined as an inverse problem, where one identifies a sequence of pillars
(chosen, with replacement, from a finite set of pillars, each of which produces
a specific transformation) whose composite transformation results in a
user-defined desired transformation. As with most such problems in
engineering, these inverse problems are usually computationally intractable,
with most traditional approaches based on search and optimization strategies.
In this paper, we pose this inverse problem as a Reinforcement Learning (RL)
problem. We train a DoubleDQN agent to learn from this environment. The results
suggest that learning is possible using a DoubleDQN model with the success
frequency reaching 90% in 200,000 episodes and the rewards converging. While
most of the results are obtained by fixing a particular target flow shape to
simplify the learning problem, we later demonstrate how to transfer the
learning of an agent based on one target shape to another, i.e., from one design
to another, making the approach useful for generic flow shape design.
Comment: NeurIPS 2018 Deep RL workshop
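For readers unfamiliar with the DoubleDQN agent mentioned in this abstract, the following sketch shows the standard Double DQN bootstrap target: the online network selects the next action, and the target network evaluates it. The abstract does not give the authors' implementation, so the function name, the list-based Q-values, and the numbers are hypothetical.

```python
def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Bootstrap target for one transition under Double DQN.

    q_online_next and q_target_next hold the next-state action values from
    the online and target networks, indexed by action.
    """
    if done:
        # Terminal transition: no bootstrapping.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * q_target_next[best_action]


# Hypothetical values for a state with three candidate pillars (actions).
q_online_next = [0.2, 0.9, 0.1]
q_target_next = [0.5, 0.3, 0.8]

target = double_dqn_target(1.0, False, q_online_next, q_target_next, gamma=0.9)
print(target)
```

Decoupling selection from evaluation is what distinguishes this target from the plain DQN target, which would take the maximum of the target network's values directly and tends to overestimate.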
Deep Value of Information Estimators for Collaborative Human-Machine Information Gathering
Effective human-machine collaboration can significantly improve many learning
and planning strategies for information gathering via fusion of 'hard' and
'soft' data originating from machine and human sensors, respectively. However,
gathering the most informative data from human sensors without task overloading
remains a critical technical challenge. In this context, Value of Information
(VOI) is a crucial decision-theoretic metric for scheduling interaction with
human sensors. We present a new Deep Learning based VOI estimation framework
that can be used to schedule collaborative human-machine sensing with
computationally efficient online inference and minimal policy hand-tuning.
Supervised learning is used to train deep convolutional neural networks (CNNs)
to extract hierarchical features from 'images' of belief spaces obtained via
data fusion. These features can be associated with soft data query choices to
reliably compute VOI for human interaction. The CNN framework is described in
detail, and a performance comparison to a feature-based POMDP scheduling policy
is provided. The practical feasibility of our method is also demonstrated on a
mobile robotic search problem with language-based semantic human sensor inputs.
Comment: 10 pages, to appear in ICCPS 201
A Forward-Backward Approach for Visualizing Information Flow in Deep Networks
We introduce a new, systematic framework for visualizing information flow in
deep networks. Specifically, given any trained deep convolutional network model
and a given test image, our method produces a compact support in the image
domain that corresponds to a (high-resolution) feature that contributes to the
given explanation. Our method is both computationally efficient and
numerically robust. We present several preliminary numerical results that
support the benefits of our framework over existing methods.
Comment: Presented at NIPS 2017 Symposium on Interpretable Machine Learning
Deep Action Sequence Learning for Causal Shape Transformation
Deep learning has become the method of choice in recent years for solving a wide
variety of predictive analytics tasks. For sequence prediction, recurrent
neural networks (RNN) are often the go-to architecture for exploiting
sequential information where the output is dependent on previous computation.
However, the dependencies of the computation lie in the latent domain which may
not be suitable for certain applications involving the prediction of a
step-wise transformation sequence that is dependent on the previous computation
only in the visible domain. We propose that a hybrid architecture of
convolutional neural networks (CNN) and stacked autoencoders (SAE) is sufficient
to learn a sequence of actions that nonlinearly transforms an input shape or
distribution into a target shape or distribution with the same support. While
such a framework can be useful in a variety of problems such as robotic path
planning, sequential decision-making in games, and identifying material
processing pathways to achieve desired microstructures, the application of the
framework is exemplified by the control of fluid deformations in a microfluidic
channel by deliberately placing a sequence of pillars. Learning of a multistep
topological transform has significant implications for rapid advances in
material science and biomedical applications.